
Core SDK

Installation

The core SDK, dynamofl, is available on PyPI and can be installed with the following command:

pip install dynamofl
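
Once installed, the client class can be imported directly. A minimal sketch, assuming the DynamoFL class is exported from the package root (the import path is an assumption; the examples below use the class directly):

# Assumed import path for the DynamoFL client class
from dynamofl import DynamoFL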

Constructor

DynamoFL

DynamoFL(api_key, host?, metadata?)

Initializes the DynamoFL client instance connection. The DynamoFL object is your entrypoint to the rest of the DynamoFL SDK.

Your DynamoFL API key is required when calling this function, as it identifies your account to the DynamoFL server. To create an API key, navigate to the profile page on DynamoFL and generate a new access token.

Method Parameters

api_key | required string

API key obtained from the UI profile page.


host | optional string

API server identifier, https://api.dynamo.ai by default.


metadata | optional dict

Default metadata to apply to all datasources attached by the instance. Not required for penetration testing or model evaluation.


Returns

DynamoFL object.

Example

# Initialize instance on default dynamofl server.
dfl = DynamoFL('56dfae3d-12aa-4148-988e-b71aebb8a75c')

# Initialize instance against a locally hosted server
dfl = DynamoFL('56dfae3d-12aa-4148-988e-b71aebb8a75c', host='http://localhost:3000')
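
# Initialize instance with default metadata applied to all attached datasources
# (the metadata values below are placeholders for illustration)
dfl = DynamoFL(
    '56dfae3d-12aa-4148-988e-b71aebb8a75c',
    metadata={'team': 'ml-platform', 'environment': 'staging'},
)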

Models - Local

create_model

create_model(name, architecture, architecture_hf_token?, model_file_path?, model_file_paths?, checkpoint_json_file_path?, peft_config_path?, key?)

Creates a new local model object by uploading a model file. The model object can then be used for running evaluations and tests.

Method Parameters

name | required string

Model name.


architecture | required string

HuggingFace hub id for the architecture to load the model file into.
Example: "mistralai/mistral-7b-v0.1"


architecture_hf_token | optional string

HuggingFace token for the provided architecture. Required if the model is private or gated on the hub.
Example: "hf_***". Go to hf.co/settings/tokens to generate a new token.


model_file_path | optional string

Path to file containing the model weights. Required unless model_file_paths is provided for sharded weights. Valid file extensions include .pt, .bin and .safetensors.
Example: "my_model_path/pytorch_model.bin"


model_file_paths | optional list

List of paths to files containing sharded model weights. Valid file extensions include .pt, .bin and .safetensors.
Example: ["my_model_path/pytorch_model-00001-of-00002.bin", "my_model_path/pytorch_model-00002-of-00002.bin"]


checkpoint_json_file_path | optional string

Path to json file containing sharded model configuration.
Example: "my_model_path/pytorch_model.bin.index.json"


peft_config_path | optional string

Path to json file containing adapter configuration, necessary if the provided model_file_path points to an adapter model file from PEFT library ("adapter_model.bin").
Example: "my_model_path/adapter_config.json"


key | optional string

Unique model identifier key. Will be autogenerated if not provided.


Returns

LocalModel object.

Example

# Creating a model from a local file upload
model = dfl.create_model(
name="Sheared LLaMA",
architecture="princeton-nlp/Sheared-LLaMA-1.3B",
model_file_path="test_models/pytorch_model.bin",
)

# Creating a model from a local file upload fine-tuned with LoRA (PEFT)
model = dfl.create_model(
name="Sheared LLaMA w/ LoRA",
architecture="princeton-nlp/Sheared-LLaMA-1.3B",
model_file_path="test_models/adapter_model.bin",
peft_config_path="test_models/adapter_config.json",
)

create_hf_model

create_hf_model(name, hf_id, architecture_hf_id?, is_peft?, hf_token?, key?)

Creates a new local model object by providing its HuggingFace hub id. The model object can then be used for running evaluations and tests.

Method Parameters

name | required string

Model name.


hf_id | required string

HuggingFace hub id for the model.


architecture_hf_id | optional string

HuggingFace hub id for the architecture. Required for PEFT adapter models.


is_peft | optional bool

Boolean to indicate whether the provided hf_id points to a PEFT adapter model.


hf_token | optional string

HuggingFace token for the provided model/architecture. Required if the model/architecture is private or gated on the hub.
Example: "hf_***". Go to hf.co/settings/tokens to generate a new token.


key | optional string

Unique model identifier key. Will be autogenerated if not provided.


Returns

LocalModel object.

Example

model = dfl.create_hf_model(
name="Llama2 7b Chat",
hf_id="NousResearch/Llama-2-7b-chat-hf"
)
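
# Creating a model from a PEFT adapter hosted on the HuggingFace hub
# (the adapter repo id below is a placeholder for illustration)
model = dfl.create_hf_model(
    name="Llama2 7b Chat w/ LoRA",
    hf_id="my-org/llama2-7b-chat-lora-adapter",
    architecture_hf_id="NousResearch/Llama-2-7b-chat-hf",
    is_peft=True,
    hf_token="hf_***",
)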

Models - Remote

create_openai_model

create_openai_model(name, api_key, api_instance, key?)

Creates a new remote OpenAI model object. The model object can then be used for running evaluations and tests.

Method Parameters

name | required string

Model name.


api_key | required string

OpenAI API Key to use for the model.


api_instance | required string

Model identifier name from OpenAI.
Example: "gpt3.5-turbo-0125"


key | optional string

Unique model identifier key. Will be autogenerated if not provided.


Returns

RemoteModel object.

Example

model = dfl.create_openai_model(
name="GPT 3.5 Model",
api_instance="gpt-3.5-turbo-0125",
api_key="sk-***"
)

create_azure_openai_model

create_azure_openai_model(name, api_key, api_instance, api_version, model_endpoint, key?)

Creates a new remote Azure OpenAI model object. The model object can then be used for running evaluations and tests.

Method Parameters

name | required string

Model name.


api_key | required string

Azure API Key to use for the model.


api_instance | required string

Model identifier name from Azure OpenAI. Example: "gpt35-turbo"


api_version | required string

Azure OpenAI version to use. Example: "2023-07-01-preview"


model_endpoint | required string

URL string representing model endpoint. Example: "https://abc-azure-openai.openai.azure.com/"


key | optional string

Unique model identifier key. Will be autogenerated if not provided.


Returns

RemoteModel object.

Example

model = dfl.create_azure_openai_model(
name="GPT 3.5 Model",
api_instance="gpt35-turbo",
api_key="***",
api_version="2023-07-01-preview",
model_endpoint="https://abc-azure-openai.openai.azure.com/"
)

create_databricks_model

create_databricks_model(name, api_key, model_endpoint, key?)

Creates a new remote Databricks model object. The model object can then be used for running evaluations and tests.

Method Parameters

name | required string

Model name.


api_key | required string

Databricks API Key to use for the model.


model_endpoint | required string

Databricks model endpoint. Example: "<host>/serving-endpoints/<some-model-name>/invocations"


key | optional string

Unique model identifier key. Will be autogenerated if not provided.


Returns

RemoteModel object.

Example

model = dfl.create_databricks_model(
name="Databricks Mixtral",
api_key="***",
model_endpoint="<host>/serving-endpoints/<mixtral-model-name>/invocations"
)

create_custom_model

create_custom_model(name, remote_model_endpoint, remote_api_auth_config, request_transformation_expression?, response_transformation_expression?, response_type?, batch_size?, multi_turn_support?, enable_retry?, key?)

Creates a new remote model object by providing its endpoint and authentication details. The model object can then be used for running evaluations and tests. This method is particularly useful for incorporating models that are hosted externally but need to be accessed via DynamoFL platform. For more details on integrating custom models, including request/response formats and JSONata transformations, see the Custom Language Model guide.

Method Parameters

name | required string

Model name.


remote_model_endpoint | required string

The endpoint URL where the remote model is hosted. This URL is used by DynamoFL to send requests to the model.


remote_api_auth_config | required dict

A dictionary containing the authentication configuration needed to connect to the remote model. To learn more about the supported authentication types, see the Supported Authentication Types guide.


request_transformation_expression | optional string

JSONata expression to transform incoming requests. This parameter is optional and only needed if the request format needs to be modified.


response_transformation_expression | optional string

JSONata expression to transform outgoing responses. This parameter is optional and only needed if the response format needs to be adjusted.


response_type | optional string (default: "string")

Defines the expected type of the response from the remote model. Valid types are "string" (default) and "boolean".


batch_size | optional int (default: 1)

Specifies the number of requests that can be sent to the remote model in a single batch. This is useful for models that can process multiple inputs at once for efficiency.


multi_turn_support | optional bool (default: True)

Indicates whether the model supports multi-turn interactions, such as in a conversational AI scenario.


enable_retry | optional bool (default: False)

Enables automatic retries of requests in case of failures. Useful for improving reliability in communication with the remote model.


key | optional string

Unique model identifier key. Will be autogenerated if not provided.

Returns

RemoteModelEntity: An object representing the remote model integrated into DynamoFL.

Example

from dynamofl.entities import AuthTypeEnum

model = dfl.create_custom_model(
name="External AI Model",
remote_model_endpoint="https://api.externalmodel.com/chat/completions",
remote_api_auth_config={
"_type": AuthTypeEnum.BEARER,
"token": "your_api_token_here"
},
request_transformation_expression=None,
response_transformation_expression=None,
response_type="string",
batch_size=32,
multi_turn_support=False,
enable_retry=True
)
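
If the endpoint's request/response schema differs from DynamoFL's defaults, the JSONata transformation expressions can reshape payloads. A hypothetical sketch, assuming the endpoint expects a {"prompt": ...} request body, returns its text in an "output" field, and that the outgoing payload exposes a prompt field (none of these field names come from the source):

model = dfl.create_custom_model(
    name="External AI Model (transformed)",
    remote_model_endpoint="https://api.externalmodel.com/generate",
    remote_api_auth_config={
        "_type": AuthTypeEnum.BEARER,
        "token": "your_api_token_here"
    },
    # JSONata: wrap the prompt into the endpoint's expected request body (assumed field names)
    request_transformation_expression='{"prompt": prompt}',
    # JSONata: pull the generated text out of the endpoint's response (assumed field name)
    response_transformation_expression="output",
    response_type="string",
)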

Models - Helpers

get_model

get_model(key)

Returns model object based on identifier key.

Method Parameters

key | required string

Unique model identifier key.


Returns

LocalModel or RemoteModel object.

Example

model = dfl.get_model('unique_identifier_key')

Datasets

create_dataset

create_dataset(file_path, name?, test_file_path?, key?)

Creates a new dataset object by uploading a dataset file.

Method Parameters

file_path | required string

Path to file containing dataset. Valid file extensions include .csv, .txt.


test_file_path | optional string

Path to the file containing the test split of the dataset. Required for downstream membership inference tests. Valid file extensions include .csv, .txt.


key | optional string

Unique dataset identifier key. Will be autogenerated if not provided.


name | optional string

Dataset name.


Returns

Dataset object.

Example

dataset = dfl.create_dataset(
file_path="test_datasets/train.csv",
name="Fine-tuning dataset",
)

# with test file path
dataset = dfl.create_dataset(
file_path="data/train.csv",
test_file_path="data/test.csv",
name="Fine-tuning dataset",
)

create_hf_dataset

create_hf_dataset(name, hf_id, hf_token?, key?)

Creates a new dataset object that points to a dataset hosted on the HuggingFace hub.

Method Parameters

name | required string

Dataset name.


hf_id | required string

HuggingFace hub id for the dataset. 'train' and 'test' splits are required for downstream membership inference tests.


hf_token | optional string

HuggingFace token for the provided dataset id. Required if the dataset is private or gated on the hub.


key | optional string

Unique dataset identifier key. Will be autogenerated if not provided.


Returns

HFDataset object.

Example

dataset = dfl.create_hf_dataset(
name="HF dataset",
hf_id="fka/awesome-chatgpt-prompts"
hf_token="hf_***",
)

Vector Databases

ChromaDB

ChromaDB(host, port, collection, ef_inputs)

Initializes a Chroma vector database connection. Required for tests on RAG workflows.

Method Parameters

host | required string

Host connection for vector database.


port | required int

Port for vector database connection.


collection | required string

Vector database collection name


ef_inputs | required object

Embedding function to be used for vector database.

HuggingFace

api_key | required string

HuggingFace Hub API key to access embedding function.

model_name | required string

HuggingFace Hub embedding function model name.

OpenAI

api_key | required string

OpenAI API key to access embedding function.

model_name | required string

OpenAI embedding function model name.

Azure OpenAI

api_key | required string

Azure OpenAI API key to access embedding function.

model_name | required string

Azure OpenAI embedding function model name.

api_base | required string

Azure OpenAI API base endpoint.

api_version | required string

Azure OpenAI API endpoint version.

Sentence Transformer

model_name | required string

Sentence Transformer embedding function model name.

Returns

ChromaDB object.

Example

chroma_args = {
"host": "chroma-service.chroma.svc.cluster.local",
"port": 8000,
"collection": "my_collection",
"ef_inputs": {
"ef_type": "sentence_transformer",
"model_name": "my_embedding_function",
},
}
chroma_connection = ChromaDB(**chroma_args)
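
# A sketch using a HuggingFace Hub embedding function instead of a local
# Sentence Transformer, assuming ChromaDB accepts the same ef_inputs structure
# shown in the LlamaIndexDB example below ("hf" ef_type with an api_key field).
chroma_args = {
    "host": "chroma-service.chroma.svc.cluster.local",
    "port": 8000,
    "collection": "my_collection",
    "ef_inputs": {
        "ef_type": "hf",
        "model_name": "all-MiniLM-L6-v2",
        "api_key": "hf_***",
    },
}
chroma_connection = ChromaDB(**chroma_args)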

LlamaIndexDB

LlamaIndexDB(aws_key, aws_secret_key, s3_bucket_name, ef_inputs)

Initializes a LlamaIndex vector index connection. Required for tests on RAG workflows.

Method Parameters

aws_key | required string

AWS S3 access key for the persistent directory


aws_secret_key | required string

AWS S3 secret access key for the persistent directory


s3_bucket_name | required string

AWS S3 bucket name for the persistent directory


ef_inputs | required object

Embedding function to be used for vector database.

HuggingFace

api_key | required string

HuggingFace Hub API key to access embedding function.

model_name | required string

HuggingFace Hub embedding function model name.

OpenAI

api_key | required string

OpenAI API key to access embedding function.

model_name | required string

OpenAI embedding function model name.

Azure OpenAI

api_key | required string

Azure OpenAI API key to access embedding function.

model_name | required string

Azure OpenAI embedding function model name.

api_base | required string

Azure OpenAI API base endpoint.

api_version | required string

Azure OpenAI API endpoint version.

Sentence Transformer

model_name | required string

Sentence Transformer embedding function model name.

Returns

LlamaIndexDB object.

Example

llamaindex_args = {
"aws_key": AWS_KEY, # aws s3 access key
"aws_secret_key": AWS_SECRET_KEY, # aws s3 secret access key
"s3_bucket_name": "llamaindex-test", # aws s3 bucket name
"ef_inputs": {
"ef_type": "hf", # embedding function provider
"model_name": "all-MiniLM-L6-v2", # embedding function model name
"api_key": HF_AUTH_TOKEN, # credential needed to access the model
},
}
llamaindex_connection = LlamaIndexDB(**llamaindex_args)

CustomRagDB

CustomRagDB(custom_rag_application_id)

Initializes a CustomRagDB connection. Required for tests on RAG workflows.

Method Parameters

custom_rag_application_id | required int

Custom RAG application id


Returns

CustomRagDB object.

Example

custom_rag_arg = {
"custom_rag_application_id": 12 # id of custom-rag-application
}
custom_rag_connection = CustomRagDB(**custom_rag_arg)

Tests - Privacy

create_membership_inference_test

create_membership_inference_test(name, model_key, dataset_id, gpu, input_column, prompts_column?, reference_column?, base_model?, pii_classes?, regex_expressions?, grid?)

Creates and orchestrates a new membership inference test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


input_column | required string

Input column in the dataset to use for membership inference.


prompts_column | optional string

Column to specify the prompts for the input. For encoder-decoder models only.


reference_column | optional string

Column to specify the reference for the input. For encoder-decoder models only.


base_model | optional string

Base model to use for the attack, can be a HuggingFace hub id.


pii_classes | optional List[string]

PII classes to attack, e.g. PERSON.


regex_expressions | optional Dict[str, str]

Dictionary of regex expressions to use for extraction, keyed by label.


Returns

Test object.

Example

test_info = dfl.create_membership_inference_test(
name="membership_inference_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
pii_classes=["PERSON", "ORG", "LOC", "DATE"],
regex_expressions={
"USERNAME": r"([a-zA-Z]+_[a-zA-Z0-9]+)",
"EMAIL": r"([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})",
"SSN": r"(\d{3}-\d{2}-\d{4})",
},
input_column="email_body",
reference_column="email_body",
base_model="gpt2",
grid=[
{
"temperature": [1.0, 0.5, 0.7],
}
],
)

create_pii_extraction_test

create_pii_extraction_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, extraction_prompt?, sampling_rate?, regex_expressions?, grid?)

Creates and orchestrates a new PII extraction test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


pii_ref_column | required string

Column in the dataset to sample prompts from.


prompts_column | optional string

Column to specify the prompts for the input. For encoder-decoder models only.


base_model | optional string

Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.


pii_classes | optional List[string]

PII classes to attack, e.g. PERSON.


regex_expressions | optional Dict[str, str]

Dictionary of regex expressions to use for extraction, keyed by label.


extraction_prompt | optional string

Prompt for PII extraction. If provided, must be one of "", "dfl_dynamic", or "dfl_ata".


sampling_rate | optional float

Number of times to attempt generating candidates.


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack. Check How to use grid? for further details


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response

Returns

Test object.

Example

test_info = dfl.create_pii_extraction_test(
name="pii_extraction_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=VRAMConfig(vramGB=16),
sampling_rate=128,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"seq_len": [256],
}
],
)

create_pii_inference_test

create_pii_inference_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, regex_expressions?, num_targets?, candidate_size?, sample_and_shuffle?, grid?)

Creates and orchestrates a new PII inference test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


pii_ref_column | required string

Column in the dataset to sample prompts from.


prompts_column | optional string

Column to specify the prompts for the input. For encoder-decoder models only.


base_model | optional string

Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.


pii_classes | optional List[string]

PII classes to attack, e.g. PERSON.


regex_expressions | optional Dict[str, str]

Dictionary of regex expressions to use for extraction, keyed by label.


num_targets | optional int

Number of target sequences to sample for the attack.


candidate_size | optional int

Number of PII candidates to sample randomly for the attack.


sample_and_shuffle | optional int

Number of times to sample and shuffle candidates.


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack. Check How to use grid? for further details.


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response
target_sequence_scope | str | entire_sample | Determines the breadth of the sentence length for the model

Returns

Test object.

Example

test_info = dfl.create_pii_inference_test(
name="pii_inference_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=VRAMConfig(vramGB=16),
num_targets=32,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"seq_len": [256],
}
],
)

create_pii_reconstruction_test

create_pii_reconstruction_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, regex_expressions?, num_targets?, candidate_size?, sampling_rate?, grid?)

Creates and orchestrates a new PII reconstruction test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


pii_ref_column | required string

Column in the dataset to sample prompts from.


prompts_column | optional string

Column to specify the prompts for the input.


base_model | optional string

Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.


pii_classes | optional List[string]

PII classes to attack, e.g. PERSON.


regex_expressions | optional Dict[str, str]

Dictionary of regex expressions to use for extraction, keyed by label.


num_targets | optional int

Number of target sequences to sample for the attack.


candidate_size | optional int

Number of PII candidates to sample randomly for the attack.


sampling_rate | optional float

Number of times to attempt generating candidates.


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack. Check How to use grid? for further details.


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response
target_sequence_scope | str | entire_sample | Determines the breadth of the sentence length for the model

Returns

Test object.

Example

test_info = dfl.create_pii_reconstruction_test(
name="pii_reconstruction_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
sampling_rate=128,
num_targets=32,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"temperature": [0.5, 1],
"seq_len": [256],
}
],
)

create_sequence_extraction_test

create_sequence_extraction_test(name, model_key, dataset_id, gpu, memorization_granularity, sampling_rate, is_finetuned, base_model?, title?, title_column?, text_column?, source?, grid?)

Creates a sequence extraction test on a model with a dataset to evaluate memorization.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


memorization_granularity | required string

Granularity of memorization, e.g. paragraph or sentence.


sampling_rate | required int

Number of times to attempt generating candidates.


is_finetuned | required bool

Whether the model is finetuned or not; determines whether to generate the fine-tuned or the base model report


base_model | optional string

Base model to use for the attack, can be a HuggingFace hub id or an API instance name.


title | optional string

Title to use for the attack


title_column | optional string

Name of column containing the title of the document, if dataset is csv


text_column | optional string

Name of column containing the text of the document, if dataset is csv


source | optional string

Source of the dataset e.g. NYT


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0
seq_len | int | Number of tokens to generate in model response, 256 by default
prompt_length | int | Length of the prefix/suffix being used to prompt the model

Returns

Test object.

Example

test_info = dfl.create_sequence_extraction_test(
name="sequence_extraction_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
memorization_granularity="paragraph",
source="NYT",
grid=[
{
"temperature": [0],
"seq_len": [256],
"prompt_length": [40]
}
],
)

Tests - Performance and Hallucinations

create_hallucination_test

create_hallucination_test(name, model_key, dataset_id, gpu, hallucination_metrics, input_column, topic_list?, prompts_column?, reference_column?, grid?)

Creates and orchestrates a new hallucination test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


hallucination_metrics | required List[string]

Hallucination metrics used, e.g. nli-consistency, unieval-factuality.


input_column | required string

Input column in the dataset to use for hallucination test


topic_list | optional List[string]

List of topics to cluster the result


prompts_column | optional string

Column to specify the prompts for the input


reference_column | optional string

Column to specify the reference for the input


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack. Check How to use grid? for further details


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response

Returns

Test object.

Example

test_info = dfl.create_hallucination_test(
name="hallucination_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
hallucination_metrics=["nli-consistency", "unieval-factuality"],
input_column="document",
grid=[
{
"temperature": [1.0, 0.7],
}
],
)

create_performance_test

create_performance_test(name, model_key, dataset_id, gpu, performance_metrics, input_column, topic_list?, prompts_column?, reference_column?, grid?)

Creates and orchestrates a new performance test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


performance_metrics | required List[string]

List of metrics to calculate for performance test, options include: ['rouge', 'bertscore']


input_column | required string

Input column in the dataset to use for performance evaluation


topic_list | optional List[string]

List of topics to cluster the result


prompts_column | optional string

Column to specify the prompts for the input


reference_column | optional string

Column to specify the reference for the input


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response

How to use grid?

The combinations are generated across the parameters supplied in each dict present in the list.

Scenario 1 Input =>

grid=[{
"seq_len": [128],
"temperature": [0.5]
}]

Output => 1 Attack

{"seq_len": 128, "temperature": 0.5}

Scenario 2 Input =>

grid=[
{"seq_len": [128]},
{"temperature": [0.5]}
]

Output => 2 attacks with the following configs

Attack 1 -> {"seq_len": 128}
- Default value for temperature is used in this attack

Attack 2 -> {"temperature": 0.5}
- Default value for seq_len is used in this attack
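
Conceptually, each dict in the grid expands into its own Cartesian product over the lists it contains, and the attacks from all dicts are concatenated. An illustrative sketch of that expansion (not the SDK's actual implementation):

import itertools

grid = [
    {"temperature": [1.0, 0.5], "seq_len": [64]},  # 2 x 1 = 2 attacks
    {"seq_len": [64, 256]},                        # 2 attacks, default temperature
]

attacks = []
for block in grid:
    keys = list(block.keys())
    for values in itertools.product(*(block[k] for k in keys)):
        attacks.append(dict(zip(keys, values)))

print(attacks)
# [{'temperature': 1.0, 'seq_len': 64}, {'temperature': 0.5, 'seq_len': 64},
#  {'seq_len': 64}, {'seq_len': 256}]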

Returns

Test object.

Example

test_info = dfl.create_performance_test(
name="performance_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
performance_metrics=["rouge", "bertscore"],
input_column="document",
grid=[
{
"temperature": [1.0, 0.5],
"seq_len": [64]
},
{
"seq_len": [64, 256]
}
],
)

create_rag_hallucination_test

create_rag_hallucination_test(name, model_key, dataset_id, gpu, rag_hallucination_metrics, input_column, example_column?, prompts_column?, prompt_template?, topic_list?, vector_db?, grid?)

Creates and orchestrates a new RAG hallucination test on a model with a dataset.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


dataset_id | required string

Unique identifier of dataset object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


rag_hallucination_metrics | required List[string]

List of metrics to be calculated during RAG evaluation test, see below for valid options. For more details, please see the appendix.

  • RAG Hallucination Metric Options
    • retrieval-relevance | Evaluate the relevance of documents retrieved from the vectorDB using your embedding model
    • response-relevance | Evaluate the relevance of model generated responses to input queries
    • faithfulness | Evaluate the faithfulness of model generated responses to the retrieved document context

input_column | required string

Input column in the dataset to use for rag_hallucination evaluation


example_column | optional string

Example column used for few shot examples


prompts_column | optional string

Column to specify the prompts for the input


prompt_template | optional string

Prompt template to use for the attack


topic_list | optional List[string]

List of topics to cluster the result


vector_db | required VectorDB

Vector database object to be used in RAG workflow. Supported vector database types:

  • ChromaDB
  • CustomRagDB
  • LlamaIndexDB
  • LlamaIndexWithChromaDB
  • PostgresVectorDB

grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack. Check How to use grid? for further details


Hyperparameters

Param | Type | Default | Description
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0
seq_len | int | 256 | Number of tokens to generate in model response
retrieve_top_k | int | 3 | Number of documents retrieved from the vector database and provided as context for evaluation. The recommended number is 3. Must be > 2

Returns

Test object.

Example

prompt_template = """Answer the question based only on the following context: {context}\n\nQuestion: {question}\n"""  
chroma_args = {
"host": "abc-host",
"port": 8000,
"collection": "multidoc2dial",
"ef_inputs": {
"ef_type": "sentence_transformer",
"model_name": "all-MiniLM-L6-v2",
},
}

test_info = dfl.create_rag_hallucination_test(
name="rag_hallucination_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
input_column="queries_pp",
gpu=VRAMConfig(vramGB=16),
rag_hallucination_metrics=["retrieval-relevance", "response-relevance", "faithfulness"],
topic_list=[
"driver license registration",
"student scholarship and financial aid",
"eligibility and benefits for disabled and registration process",
"social security services and retirement plans",
],
prompt_template=prompt_template,
vector_db=chroma_args,
grid=[
{
"temperature": [1.0],
"seq_len": [256],
"retrieve_top_k": [3],
}
],
)

Tests - Compliance and Security

create_cybersecurity_compliance_test

create_cybersecurity_compliance_test(name, model_key, gpu, base_model?, sampling_rate?, grid?)

Creates and orchestrates a new Cybersecurity Compliance test on a specified model

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


base_model | optional string

Base model to use for the attack, can be a HuggingFace hub id or an API instance name.


sampling_rate | optional int

Number of attack prompts we feed the model. Default is 50.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_cybersecurity_compliance_test(
name="test{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

create_static_jailbreak_test

create_static_jailbreak_test(name, model_key, gpu, fast_mode, dataset_id?, grid?)

Creates a static jailbreak test on a model.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


dataset_id | optional str

Id of the dataset to be used. If not provided, the test will default to the v0 dataset, which is a small dataset with 50 prompts for testing purposes:

https://github.com/patrickrchao/JailbreakingLLMs/blob/main/data/harmful_behaviors_custom.csv

If using a custom dataset, ensure that the dataset has the following columns:

  • "goal": the prompt
  • "category": the category of the prompt
  • "shortened_prompt": the goal column shortened to 1-2 words (used for encoding attack and ascii art attack)
  • "gcg": the prompt that includes the gcg suffix

grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_static_jailbreak_test(
name="static_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

create_bias_toxicity_test

create_bias_toxicity_test(name, model_key, gpu, fast_mode, grid?)

Creates a bias/toxicity test on a model.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_bias_toxicity_test(
name="bias_toxicity_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

create_adaptive_jailbreak_test

create_adaptive_jailbreak_test(name, model_key, gpu, fast_mode, dataset_id?, grid?)

Creates an adaptive jailbreak test on a model.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


gpu | required GPUSpecification

GPUSpecification object identifying GPU configurations for test.


dataset_id | optional str

ID of the dataset to be used. If not provided, the test defaults to an internal attack dataset comprising 50 adversarial prompts.

If using a custom dataset, ensure that the dataset has the following columns:

  • "goal": the prompt
  • "target": the target column

grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_adaptive_jailbreak_test(
name="create_adaptive_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

create_prompt_extraction_test

create_prompt_extraction_test(name, model_key, gpu?, grid?)

Creates a system prompt extraction test on a model to evaluate if the model leaks its system prompt.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


gpu | optional GPUSpecification

GPUSpecification object identifying GPU configurations for test. Required for local models, optional for remote models (defaults to A10G).


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_prompt_extraction_test(
name="prompt_extraction_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

create_multilingual_jailbreak_test

create_multilingual_jailbreak_test(name, model_key, language, gpu?, grid?)

Creates a multilingual jailbreak test on a model to evaluate its safety across different languages.

Method Parameters

name | required string

Test identifier name.


model_key | required string

Unique identifier of model object that test will be run on.


language | required string

Language to test the model in. Currently supports Japanese (ja).


gpu | optional GPUSpecification

GPUSpecification object identifying GPU configurations for test. Required for local models, optional for remote models (defaults to A10G).


grid | optional List[Dict[str, List[str | float | int]]]

Grid of hyperparameters supported for this attack


Hyperparameters

Param | Type | Description
temperature | float | Model temperature, controls model randomness, should be > 0

Returns

Test object.

Example

test_info = dfl.create_multilingual_jailbreak_test(
name="multilingual_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
language="ja", # for Japanese
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)

Tests - Helpers

get_attack_info

get_attack_info(attack_id)

Returns attack object status.

Method Parameters

attack_id | required string

Unique attack identifier.

Returns

Attack result JSON object.

Example

all_attacks = test_info.attacks
attack_ids = [attack["id"] for attack in all_attacks]
for attack in attack_ids:
attack_info = dfl.get_attack_info(attack)
# Example Response:
# {'id': '6566d2718cf68d15c393ff0d',
# 'status': 'COMPLETED',
# 'failureReason': None,
# 'response': {
# 'metrics': {
# 'precision': 0.023429541595925297,
# 'recall': 0.014047231270358305,
# 'pii_intersection_per_category': {'DATE': 57, 'ORG': 6, 'PERSON': 6},
# 'dataset_pii_per_category': {'ORG': 1848, 'EMAIL': 494, 'USERNAME': 1130, 'DATE': 518, 'PERSON': 922},
# 'dataset_pii_category_count': 5,
# 'dataset_top_3_categories': ['ORG', 'USERNAME', 'PERSON'],
# 'extracted_pii_per_category': {'DATE': 568, 'EMAIL': 424, 'USERNAME': 1120, 'PERSON': 721, 'ORG': 112},
# 'samples': [{'prompt': '', 'response': "..."}, {...}],
# 'model_type': 'decoder'
# },
# 'inferences_location': 's3://dynamofl-pentest-prod/attacks/output/naive_extraction_1701238142.json',
# 'resolved_args': {'attack_args': {...}
# }
# }
# 'testId': '6566d2718cf68d15c393ff05'
# }
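
Since attacks run asynchronously, a common pattern is to poll get_attack_info until the status field shown above reaches a terminal state. A minimal sketch, assuming the returned object is a dict and that 'COMPLETED' and 'FAILED' are the terminal status values (only 'COMPLETED' appears in the response above):

import time

# Poll each attack until it reaches a terminal status
for attack_id in attack_ids:
    while True:
        attack_info = dfl.get_attack_info(attack_id)
        if attack_info["status"] in ("COMPLETED", "FAILED"):
            break
        time.sleep(30)  # wait before checking again
    print(attack_id, attack_info["status"])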

Custom RAG Adapter

DynamoEval provides the following APIs to manage custom RAG applications:

create_custom_rag_application

Creates and registers a new Custom RAG Application with specified configurations.

Method Parameters

base_url | required string

The base URL for the RAG application.


auth_type | required AuthTypeEnum

Authentication type (AuthTypeEnum.NO_AUTH, AuthTypeEnum.BEARER).


auth_config | optional dict

Authentication configuration parameters.


custom_rag_application_routes | optional list

List of route configurations.

Returns

CustomRagApplicationResponseEntity object.

You can create a Custom RAG Application in several ways:

  1. Basic creation without authentication:
from dynamofl.entities import AuthTypeEnum

custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.NO_AUTH
)
  2. Creation with authentication configuration:
from dynamofl.entities import AuthTypeEnum

custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.BEARER,
auth_config={
"token": "bearer-token"
}
)
  3. Creation along with route configuration:
from dynamofl.entities import CustomRagApplicationRoutesEntity, AuthTypeEnum, RouteTypeEnum

custom_rag_application_routes = [
CustomRagApplicationRoutesEntity(
route_type=RouteTypeEnum.RETRIEVE,
route_path="/retrieve",
request_transformation_expression=None,
response_transformation_expression=None,
)
]

custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.NO_AUTH,
custom_rag_application_routes=custom_rag_application_routes
)

update_custom_rag_application

Updates an existing Custom RAG Application's base configuration.

Method Parameters

custom_rag_application_id | required int

The unique identifier of the RAG application to update.


base_url | required string

The new base URL for the RAG application.


auth_type | required AuthTypeEnum

Authentication type (AuthTypeEnum.NO_AUTH, AuthTypeEnum.BEARER).


auth_config | optional dict

The new authentication configuration parameters.

Returns: CustomRagApplicationResponseEntity

from dynamofl.entities import AuthTypeEnum

updated_app = dfl.update_custom_rag_application(
custom_rag_application_id=123,
base_url="https://api.example.com",
auth_type=AuthTypeEnum.BEARER,
auth_config={"token": "bearer-token"}
)

get_all_custom_rag_applications

Retrieves all registered Custom RAG Applications.

Method Parameters

include_routes | optional bool

Whether to include route details in the response. Defaults to False.

Returns: AllCustomRagApplicationResponseEntity

all_apps = dfl.get_all_custom_rag_applications(include_routes=True)

get_custom_rag_application

Retrieves details of a specific Custom RAG Application.

Method Parameters

custom_rag_application_id | required int

The unique identifier of the RAG application to retrieve.


include_routes | optional bool

Whether to include route details in the response. Defaults to False.

Returns: List[CustomRagApplicationResponseEntity]

app = dfl.get_custom_rag_application(
custom_rag_application_id=123,
include_routes=True
)

delete_custom_rag_application

Removes a Custom RAG Application from the system.

Method Parameters

custom_rag_application_id | required int

The unique identifier of the RAG application to delete.

Returns: None

dfl.delete_custom_rag_application(custom_rag_application_id=123)

create_custom_rag_application_route

Adds a new route to an existing Custom RAG Application.

Method Parameters

custom_rag_application_id | required int

The ID of the RAG application to which the route belongs.


route_type | required RouteTypeEnum

Route type (RouteTypeEnum.RETRIEVE).


route_path | required string

The URL path defining the route.


request_transformation_expression | optional string

JSONata expression to transform incoming requests.


response_transformation_expression | optional string

JSONata expression to transform outgoing responses.

Returns: List[CustomRagApplicationRoutesResponseEntity]

from dynamofl.entities import RouteTypeEnum

new_route = dfl.create_custom_rag_application_route(
custom_rag_application_id=123,
route_type=RouteTypeEnum.RETRIEVE,
route_path="/retrieve",
request_transformation_expression="...",
response_transformation_expression="..."
)

update_custom_rag_application_route

Updates an existing route in a Custom RAG Application.

Method Parameters

custom_rag_application_id | required int

The ID of the RAG application to which the route belongs.


route_id | required int

The unique identifier of the route to update.


route_type | required RouteTypeEnum

Route type (RouteTypeEnum.RETRIEVE).


route_path | required string

The new URL path for the route.


request_transformation_expression | optional string

JSONata expression to transform incoming requests.


response_transformation_expression | optional string

JSONata expression to transform outgoing responses.

Returns: CustomRagApplicationRoutesResponseEntity

from dynamofl.entities import RouteTypeEnum

updated_route = dfl.update_custom_rag_application_route(
custom_rag_application_id=123,
route_id=456,
route_type=RouteTypeEnum.RETRIEVE,
route_path="/new-search",
request_transformation_expression="...",
response_transformation_expression="..."
)

delete_custom_rag_application_route

Removes a specific route from a Custom RAG Application.

Method Parameters

custom_rag_application_id | required int

The ID of the RAG application from which to delete the route.


route_id | required int

The unique identifier of the route to delete.

Returns: None

dfl.delete_custom_rag_application_route(
custom_rag_application_id=123,
route_id=456
)